Scale-free trees: the skeletons of complex networks 
We investigate the properties of the spanning trees of various real-world and model networks.
The spanning tree representing the communication kernel of the original network is determined by
maximizing the total weight of its edges, whose weights are given by the edge betweenness centralities.
We find that a complex network is organized by a scale-free tree and shortcuts. The spanning tree shows
a robust betweenness centrality distribution that was also observed in scale-free tree models. It turns out
that the shortcut distribution characterizes the properties of the original network, such as the clustering
coefficient and the classification of networks by the betweenness centrality distribution.

PACS numbers: 89.75.Hc, 89.75.Da, 89.75.Fb, 05.10.-a 



Complex network theories have attracted much attention in the last few years, with
advances in the understanding of the highly interconnected nature of various social,
biological, and communication systems. The inhomogeneity of network structures is
conveniently characterized by the degree distribution $P_d(k)$, the probability for a
vertex to have $k$ edges toward other vertices. The emergence of the scale-free
distribution $P_d(k) \sim k^{-\gamma}$ has been reported in many real-world networks,
such as the coauthorship networks in social systems, the metabolic networks and the
protein interaction networks in biological systems, and the Internet and World Wide
Web in technological systems.

It is important to study the dynamics on networks as well as their structural
properties, because of their applications to real-world systems. However, dynamical
phenomena on networks, such as traffic and information flow, are very difficult to
predict from local information because of the rich microstructures and the
correspondingly complex dynamics. Thus, to understand the dynamical phenomena on
networks, one must know the global properties of networks as well as local properties
such as the degree distribution. This is the reason why the dynamics on complex
networks has not been studied systematically so far.

Due to their inhomogeneous structure, traffic or information flow on complex networks
would also be very inhomogeneous. As a simplified quantity to measure the traffic on
networks, it is natural to use the betweenness centrality (BC). The BC of $G$, either
a vertex or an edge, is defined as

$$b(G) = \sum_{i \neq j} b(i,j;G) = \sum_{i \neq j} \frac{c(i,j;G)}{c(i,j)}, \qquad (1)$$

where $c(i,j;G)$ denotes the number of shortest paths from a vertex $i$ to $j$ through
$G$, and $c(i,j)$ is the total number of shortest paths from $i$ to $j$. In terms of
packets on the Internet, assuming that every vertex sends a unit packet to each of the
other vertices, the BC is the average amount of packets passing through a vertex or an
edge.
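As a minimal sketch of Eq. (1), the block below computes vertex and edge BC on an unweighted, undirected graph by summing over ordered pairs $(i, j)$. It uses a Brandes-style accumulation, which is a choice made for this illustration and is not specified in the text. The function name `betweenness` and the convention that the vertex BC counts only intermediate vertices (not the endpoints $i$ and $j$) are assumptions.

```python
from collections import deque, defaultdict


def betweenness(adj):
    """Return (vertex_bc, edge_bc), summed over ordered pairs (i, j), i != j.

    adj maps each vertex to a list of neighbours (undirected graph).
    Assumption: vertex BC counts a vertex only when it is strictly
    between i and j on a shortest path.
    """
    vertex_bc = {v: 0.0 for v in adj}
    edge_bc = defaultdict(float)
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma)
        # and record shortest-path predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path fractions from the farthest vertices.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge_bc[frozenset((v, w))] += share
                delta[v] += share
            if w != s:
                vertex_bc[w] += delta[w]
    return vertex_bc, dict(edge_bc)


# Usage on a three-vertex path 0 - 1 - 2.
path = {0: [1], 1: [0, 2], 2: [1]}
vertex_bc, edge_bc = betweenness(path)
```

On this path, each edge carries four ordered pairs and vertex 1 lies between the two ordered pairs (0, 2) and (2, 0).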



In scale-free networks, the distribution of the vertex BC is known to follow a power
law with an exponent of either 2.2 or 2.0. Though the edge BC distribution does not
follow a power law exactly, the distribution of the edge BC is also very inhomogeneous
in scale-free networks [12]. This indicates that there exist extremely essential edges
with large edge BCs, which are used for communication very frequently. Thus, one can
imagine a sub-network constructed only from the essential edges, with global
connectivity retained. We regard this network as a communication kernel, which handles
most of the traffic on a network.

For simplicity, we define the communication kernel of a network as the spanning tree
whose set of edges maximizes the sum of their edge BCs on the original network. The
construction procedure is very similar to the minimum spanning tree algorithm [13]. We
repeatedly select an edge in descending order of edge BC, and add the edge to the tree
if it does not form a loop, until the tree includes all vertices [14]. Note that the
residual edges can be regarded as shortcuts, since they shorten paths on the spanning
tree. The spanning tree and the shortcuts correspond, respectively, to the 1-D regular
lattice and the shortcuts in small-world networks.
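The construction described above can be sketched as a Kruskal-style procedure with a union-find structure. This is an illustration under stated assumptions: the function name `max_weight_spanning_tree` is hypothetical, ties between equal edge BCs are broken by input order (the text does not say how ties are handled), and the weights in the example are the edge BCs of a four-vertex graph (a triangle 0-1-2 with a pendant vertex 3) worked out by hand over ordered pairs.

```python
def max_weight_spanning_tree(n_vertices, weighted_edges):
    """Split edges into a maximum-weight spanning tree and shortcuts.

    weighted_edges is a list of (weight, u, v) with vertices 0..n_vertices-1.
    Returns (tree_edges, shortcut_edges).
    """
    parent = list(range(n_vertices))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, shortcuts = [], []
    # Visit edges from the largest weight to the smallest (stable for ties).
    for _, u, v in sorted(weighted_edges, key=lambda e: -e[0]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v   # joins two components: no loop
            tree.append((u, v))
        else:
            shortcuts.append((u, v))  # would close a loop: a shortcut
    return tree, shortcuts


# Edge BCs of the triangle 0-1-2 with pendant vertex 3 attached to 2.
edges = [(2, 0, 1), (4, 1, 2), (4, 0, 2), (6, 2, 3)]
tree, shortcuts = max_weight_spanning_tree(4, edges)
```

Here the pendant edge (2, 3) is selected first, both weight-4 edges follow, and the edge (0, 1) is left over as the only shortcut.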

In this paper, we investigate the structural and dynamical properties of the spanning
tree of complex networks and the role of shortcuts in the networks. For various
scale-free real-world and model networks, we find that the spanning trees show
scale-free behavior. We also find that the vertex and edge BC distributions follow a
power law with the robust exponent $\eta = 2.0$, regardless of the exponent value
$\eta = 2.0$ or $2.2$ of the original networks. On the other hand, it turns out that
the shortcut length distribution shows either Gaussian-like or monotonically decaying
behavior, depending on the BC distribution exponent $\eta$ of the original networks.

TABLE I: The scaling exponents and correlation coefficients of the spanning trees and original
networks for various real-world networks and models. Tabulated for each network are the system
size $N$, the mean degree $\langle k \rangle$, the ratio $f$ of the edge BC summation over the
edges selected for the spanning tree to the total edge BC, the ratio $f_0$ of the number of edges
in the spanning tree to that in the original network, the degree exponent $\gamma$, the BC
exponent $\eta$, the assortativity $r$, and the degree correlation coefficient $r_v$ between the
original network and the spanning tree. The subscript $s$ indicates quantities for the spanning
trees. Here we consider only the largest cluster of a network when the network has several
disjoint parts.

FIG. 1: Degree distributions of the spanning trees (○) and their original networks (+): (a) BA
model with $m = 2$, (b) coauthorship network, NEURO, (c) PIN, and (d) Internet AS. The data
points are shifted vertically to enhance the visibility.

First, we confirm that the spanning tree is a communication kernel by comparing the
relative importance of the edges selected for the obtained spanning tree with that of
edges selected at random. If we select the edges randomly, the fraction $f$ of the
edge BC summation over the selected edges to that over all edges would be
approximately $f_0$, the ratio of the number of edges in the tree to that in the
network. However, it turns out that the actual set of edges selected for the spanning
tree possesses over 50% of the total edge BC of the network (see Table I); therefore
$f \gg f_0$. For instance, the coauthorship network shows that $f$ is nearly three
times larger than $f_0$, even though the number of edges in the spanning tree is only
16% of that in the original network. Thus we can call this spanning tree the
communication kernel.
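The comparison of $f$ with $f_0$ can be illustrated on a toy example. The sketch below uses the hand-computed edge BCs of a four-vertex graph (a triangle 0-1-2 with a pendant vertex 3); the helper name `bc_fraction` is hypothetical. It also checks, by enumerating every edge subset of the same size, that the average of $f$ over random selections equals $f_0$ exactly. This example is far too small to say anything about the real networks in Table I.

```python
from fractions import Fraction
from itertools import combinations


def bc_fraction(edge_bc, selected):
    """Fraction of the total edge BC carried by the selected edges."""
    total = sum(edge_bc.values())
    return Fraction(sum(edge_bc[e] for e in selected), total)


# Edge BCs (ordered pairs) of the triangle 0-1-2 with pendant vertex 3.
edge_bc = {(0, 1): 2, (1, 2): 4, (0, 2): 4, (2, 3): 6}

# Maximum-BC spanning tree of this graph.
tree = [(2, 3), (1, 2), (0, 2)]

f = bc_fraction(edge_bc, tree)            # BC share of the tree edges
f0 = Fraction(len(tree), len(edge_bc))    # share of edges in the tree

# Average of f over all edge subsets of the same size (random selection).
subsets = list(combinations(edge_bc, len(tree)))
f_random = sum(bc_fraction(edge_bc, s) for s in subsets) / len(subsets)
```

Exact rational arithmetic is used so that the identity between the random-selection average and $f_0$ is not blurred by rounding.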

To find out more about this kernel, we measure the degree distribution of the
spanning trees. It turns out that the degree distribution always follows a power law
[16], which is tested for various networks including the Barabási-Albert (BA) model
[17], the coauthorship network in neuroscience (NEURO) [18], the protein interaction
network of yeast (PIN) [19], the Internet at the autonomous systems (AS) level [20],
and so on (see Fig. 1 and Table I). However, the details of the degree distribution
depend on the network. The exponents of the power-law degree distributions of the
spanning trees do not always agree with those of the original networks (see Table I).
This indicates that the spanning trees are far from a random sampling of edges.

To confirm the scale-free behavior of the spanning tree, we investigate the time
evolution of the degree in a growing network. Assuming that a fixed number of new
vertices is introduced at each time step in a growing network, it is well known that
a degree growing as the power law $k_i(t) \sim t^{\beta}$ leads to the scale-free
degree distribution $P_d(k) \sim k^{-\gamma}$ [21], where $k_i(t)$ is the degree of
the vertex $i$ at time $t$ and $\gamma = 1/\beta + 1$. This argument can be naturally
applied to the spanning tree of the BA model, since it grows constantly. At each time
step of the growth in the BA model, we obtain the spanning tree and measure the degree
of every vertex. In Fig. 2(a), we show the time evolution of the degrees of several
vertices. The degrees evolve with $\beta = 0.58$, which leads to $\gamma_s = 2.7$ for
the spanning tree, in agreement with our measurement from the actual degree
distribution.
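The relation $\gamma = 1/\beta + 1$ follows from a standard continuum argument, sketched here. The sketch assumes that a vertex introduced at time $t_i$ has degree $k_i(t) = m\,(t/t_i)^{\beta}$, with $m$ a hypothetical prefactor, and that the introduction times $t_i$ are uniformly distributed on $(0, t]$. Both are assumptions of this illustration.

```latex
% Assumptions: k_i(t) = m (t/t_i)^{\beta}; t_i uniform on (0, t]; k >= m.
\begin{align}
  P\bigl(k_i(t) > k\bigr)
    &= P\!\left(t_i < t\,(m/k)^{1/\beta}\right) = (m/k)^{1/\beta}, \\
  P_d(k)
    &= -\frac{d}{dk}\,P\bigl(k_i(t) > k\bigr) \propto k^{-(1/\beta + 1)}, \\
  \gamma
    &= \frac{1}{\beta} + 1, \qquad
  \beta = 0.58 \;\Rightarrow\; \gamma_s = \frac{1}{0.58} + 1 \approx 2.72 .
\end{align}
```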

FIG. 2: (a) Time evolution of the degree of two vertices added to the system at $t = 5$ (○) and
$t = 55$ (□), where the dashed line is a linear fit with slope 0.58. (b) Scatter plot of the
degree of the original network ($k$) and of the spanning tree ($k_s$). The dashed line has a
slope of 1.08.

The high correlation between the degrees in the spanning trees and in the original
networks also supports the preserved scale-free behavior of the spanning trees. The
correlation coefficient between the degree $k$ in the original network and the degree
$k_s$ in its spanning tree is defined as the Pearson correlation coefficient between
them. Most networks show a high correlation between the spanning tree and its original
network (see Table I). We find that the degrees of networks ($k$) and their spanning
trees ($k_s$) roughly follow $k_s \sim k^{\alpha}$, which leads to the degree
distribution of the spanning trees $P_d(k_s) \sim k_s^{-\gamma_s}$ with
$\gamma_s = (\gamma + \alpha - 1)/\alpha$. In Fig. 2(b), we show that the BA model has
$\alpha = 1.08$, which leads to $\gamma_s = 2.75$, in good agreement with the result
obtained by direct measurement.
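The expression for $\gamma_s$ is a change of variables in the degree distribution, sketched below. The sketch assumes that $k_s = c\,k^{\alpha}$ holds deterministically, with $c$ a hypothetical constant, whereas the measured relation holds only roughly.

```latex
% Assumption: k_s = c k^{\alpha} exactly, so that k \propto k_s^{1/\alpha}.
\begin{align}
  P_d(k_s)\,dk_s &= P_d(k)\,dk, \qquad k \propto k_s^{1/\alpha}, \\
  P_d(k_s) &\propto k^{-\gamma}\,\frac{dk}{dk_s}
            \propto k_s^{-\gamma/\alpha}\,k_s^{1/\alpha - 1}
            = k_s^{-(\gamma + \alpha - 1)/\alpha}, \\
  \gamma_s &= \frac{\gamma + \alpha - 1}{\alpha}.
\end{align}
```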

The assortativity is another interesting feature of the spanning trees. The
assortativity $r$ [22], which measures the degree correlation of vertices directly
connected by an edge, is defined by
$r = \frac{\langle jk \rangle - \langle j \rangle \langle k \rangle}{\langle k^2 \rangle - \langle k \rangle^2}$,
where $j$ and $k$ are the remaining degrees at the ends of an edge and the angular
brackets indicate the average over all edges. We find that all spanning trees show
disassortative or neutral behavior regardless of the assortativity of the original
networks (see Table I). Thus, we conjecture that this is a general characteristic of
the spanning trees of scale-free networks. Further study is needed to prove this
conjecture.
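The assortativity defined above can be evaluated directly from an edge list. The following is a minimal sketch, assuming an undirected simple graph in which every edge is counted in both orientations, so that $\langle j \rangle = \langle k \rangle$. The function name `assortativity` is hypothetical.

```python
from collections import Counter


def assortativity(edges):
    """Pearson correlation of the remaining degrees at the two ends of an edge."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Remaining degree = degree minus one; use both orientations of each edge.
    pairs = []
    for u, v in edges:
        j, k = degree[u] - 1, degree[v] - 1
        pairs.append((j, k))
        pairs.append((k, j))
    n = len(pairs)
    mean_jk = sum(j * k for j, k in pairs) / n
    mean_k = sum(k for _, k in pairs) / n
    mean_k2 = sum(k * k for _, k in pairs) / n
    variance = mean_k2 - mean_k ** 2
    if variance == 0:
        return float("nan")  # all remaining degrees equal: r is undefined
    return (mean_jk - mean_k ** 2) / variance


# A star (hub 0 with four leaves) and a four-vertex path.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
path = [(0, 1), (1, 2), (2, 3)]
r_star = assortativity(star)
r_path = assortativity(path)
```

Both toy trees come out disassortative: the star is the extreme case, since every edge joins the hub to a leaf.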

We find that the BC distribution of the spanning tree is robust regardless of its
original network. For both vertices and edges, the BC distributions follow a power law
with the robust exponent $\eta_s = 2.0$ in all spanning trees we studied (see Fig. 3
and Table I). This is consistent with the numerical results for the known scale-free
tree models. Having the same BC distribution for vertices and edges is a general
feature of trees. In the mean-field picture, the largest BC among the edges belonging
to a vertex gives the dominant contribution to the BC of the vertex. For our spanning
trees, we verify numerically that the largest edge BC of a vertex is almost equal to
the vertex BC for most vertices (see Fig. 3(c)).
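On a tree, both quantities in this comparison have closed forms in terms of branch sizes, which the sketch below uses. It rests on two assumptions that the text does not state: BC is summed over ordered pairs, and the vertex BC includes paths that start or end at the vertex. Under these assumptions an edge splitting the tree into parts of size $s$ and $N - s$ has BC $2s(N - s)$, and a vertex whose removal leaves branches of sizes $s_a$ has BC $N(N-1) - \sum_a s_a(s_a - 1)$. The function name `tree_bc_ratio` is hypothetical, and the tree must have at least two vertices.

```python
def tree_bc_ratio(n, tree_edges, root=0):
    """Return {vertex: b_max / b} for a tree on vertices 0..n-1 (n >= 2)."""
    adj = {v: [] for v in range(n)}
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    # Depth-first traversal: every parent appears in `order` before its descendants.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    # Subtree sizes, accumulated from the leaves upward.
    size = {v: 1 for v in adj}
    for v in reversed(order):
        if parent[v] is not None:
            size[parent[v]] += size[v]
    ratios = {}
    for v in adj:
        # Sizes of the branches left when v is removed.
        branches = [size[w] for w in adj[v] if parent[w] == v]
        if parent[v] is not None:
            branches.append(n - size[v])
        b_vertex = n * (n - 1) - sum(s * (s - 1) for s in branches)
        b_max = max(2 * s * (n - s) for s in branches)
        ratios[v] = b_max / b_vertex
    return ratios


# Five-vertex path 0 - 1 - 2 - 3 - 4.
ratios = tree_bc_ratio(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
```

For a leaf the ratio is exactly 1 under these conventions, because every path through the leaf also uses its single edge.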

FIG. 3: The vertex BC distribution of the original networks (○) and the spanning trees (□) for
(a) the BA model averaged over 10 ensembles and (b) Internet AS. In (a), the solid and dashed
lines have slopes of 2.0 and 2.2, respectively. The lines in (b) are linear fits with slope 2.0.
The data points are shifted vertically to enhance the visibility. (c) The ratio of the largest
edge BC ($b_{max}$) to the vertex BC ($b$) of a vertex with degree $k_s$ for the BA tree (+) and
the spanning trees of the BA model (×), NEURO (□), Internet AS (○), and PIN (△) networks.

The spanning trees show robust features, such as a scale-free degree distribution, a
robust BC distribution, and disassortative or neutral degree correlation. One can then
ask what the role is of the shortcuts that are not included in the spanning tree. To
answer this question, we focus on the length of the shortcuts on the spanning trees.
The length of a shortcut between vertices $i$ and $j$ is defined as the minimum number
of hops from $i$ to $j$ on the spanning tree. The non-zero clustering coefficient of
the original networks can now be explained by short-length shortcuts. Obviously,
shortcuts of length 2 form triangles of vertices and hence increase the clustering
coefficient. All networks with a non-vanishing clustering coefficient have a
significant number of shortcuts of length 2 (see Fig. 4).
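The shortcut length defined above is the hop distance between the endpoints of a shortcut measured along the spanning tree. A minimal sketch of the resulting length distribution follows, using a breadth-first search on the tree for each shortcut. The function name `shortcut_length_distribution` and the toy tree are assumptions made for illustration.

```python
from collections import Counter, deque


def shortcut_length_distribution(tree_edges, shortcuts):
    """Return {length d: fraction of shortcuts whose tree distance is d}."""
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    def tree_distance(src, dst):
        # Breadth-first search restricted to tree edges.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            if v == dst:
                return dist[v]
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        raise ValueError("endpoints are not connected by the tree")

    lengths = Counter(tree_distance(u, v) for u, v in shortcuts)
    total = sum(lengths.values())
    return {d: count / total for d, count in sorted(lengths.items())}


# Tree: path 0 - 1 - 2 - 3 - 4; three shortcuts of lengths 2, 2, and 4.
tree_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
shortcuts = [(0, 2), (1, 3), (0, 4)]
distribution = shortcut_length_distribution(tree_edges, shortcuts)
```

The two shortcuts of length 2 each close a triangle with two tree edges, which is the mechanism behind the clustering argument in the text.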

FIG. 4: The length distribution of shortcuts for (a) the BA model ($m = 2$), (b) the
coauthorship network of neuroscience, (c) Internet AS, and (d) the adaptation model with 10
vertices. $n_s(d)$ and $n_0$ are the number of shortcuts with length $d$ and the total number of
shortcuts, respectively.

Interestingly, we find that there are two types of shortcut length distribution (see
Fig. 4). In one distribution (type I), most shortcuts are distributed around a large
mean value, similar to a Gaussian distribution, which shows that the network has a
structure dominated by longer loops. In the other distribution (type II), the number
of shortcuts decreases monotonically as the length increases, which indicates that the
network is tree-like. Most networks, including the BA model, the coauthorship
networks, and PIN, belong to type I. On the other hand, Internet AS and the adaptation
model are type II. We find that our classification agrees exactly with the grouping by
the exponent of the BC distribution [11]. The networks belonging to type I and type II
show vertex BC distributions with exponents of 2.2 and 2.0, respectively. Goh et al.
[11] characterized the networks with the BC exponent of 2.0 by a linear mass-distance
relation, which shows that the shortest paths of the networks are similar to trees.
Our result also supports the tree-like structure of the type II networks and gives an
intuitive explanation of why the BC exponents of the type II networks are the same as
those of the scale-free trees. Because mostly short shortcuts exist in the type II
networks, with their monotonically fast-decaying shortcut length distribution, the
structure of the original networks is not significantly different from that of their
spanning trees. Therefore, the BC exponents of the type II networks remain at 2.0, the
value for their spanning trees.

In summary, we study the properties of the spanning trees with maximum total edge
betweenness centrality, which are regarded as the communication kernels of networks.
We find that a complex network can be decomposed into a scale-free tree and additional
shortcuts on it. The scale-free trees show robust characteristics in the betweenness
centrality distribution and the degree correlation. The remaining shortcuts are
responsible for the detailed characteristics of the networks, such as the clustering
property and the BC distribution. The distribution of the shortcut length clearly
divides the networks into two types, which coincide with the classes determined from
the BC exponents.